[Update] Amazon Bedrock Knowledge Bases now supports binary embeddings

Takakuni
2024.11.25
Hello! This is Takakuni (@takakuni_) from the Consulting Department of the AWS Business Division.
Amazon Bedrock Knowledge Bases now supports binary embeddings.
https://aws.amazon.com/about-aws/whats-new/2024/11/amazon-bedrock-knowledge-bases-binary-vector-embeddings-rag-applications/
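Binary Embedding
Binary embedding is a technique that represents vector data, conventionally stored as float values (32-bit floating point), as binary values (1 bit per dimension).
Representing each dimension with a single bit reduces the amount of data that has to be read, which brings the following benefits:
Lower search latency against the vector database
Lower storage cost for the vector database
Lower memory consumption in the vector database
On the other hand, a binary vector can express less than a float vector (it carries less information), so the impact on accuracy is a natural concern.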
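To make the size difference concrete, here is a minimal sketch that compares the storage needed for a 1024-dimension vector as float32 versus as packed bits. The sign-based `binarize` function is only an illustrative assumption; it is not a description of how Titan or Cohere models produce their binary output.

```python
# Illustrative only: sign-based binarization is an assumption made for this
# sketch, not the method used by any particular embedding model.
import random
import struct

DIMENSIONS = 1024  # same dimension count used later in this post


def binarize(vector):
    """Map each float to one bit: 1 if the value is positive, else 0."""
    return [1 if value > 0 else 0 for value in vector]


def float32_size_bytes(vector):
    """Bytes needed to store the vector as float32 values."""
    return len(struct.pack(f"{len(vector)}f", *vector))


def binary_size_bytes(bits):
    """Bytes needed to store the bits when packed 8 per byte."""
    return (len(bits) + 7) // 8


random.seed(0)
float_vector = [random.uniform(-1.0, 1.0) for _ in range(DIMENSIONS)]
bits = binarize(float_vector)

print(float32_size_bytes(float_vector))  # 4096
print(binary_size_bytes(bits))           # 128
```

For the raw vector data alone, that is a 32-fold reduction (4096 bytes down to 128 bytes); index overhead is not included in this arithmetic.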
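The AWS blog reports the following results for binary embeddings, measured relative to full-precision (float32) embeddings.
MTEB retrieval data set
Reduced storage
A 25-times improvement in latency
98.5% of the retrieval accuracy retained with reranking, 97% without reranking
End-to-end RAG benchmark with Amazon Titan Text Embeddings V2
99.1% of the full-precision answer correctness retained with reranking, 98.6% without reranking
OpenSearch Serverless benchmark with HNSW
50% reduction in search OpenSearch Computing Units (OCUs)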

You can lower latency and reduce storage costs and memory requirements in OpenSearch Serverless and Amazon Bedrock Knowledge Bases with minimal reduction in retrieval quality.

We ran the Massive Text Embedding Benchmark (MTEB) retrieval data set with binary embeddings. On this data set, we reduced storage, while observing a 25-times improvement in latency. Binary embeddings maintained 98.5% of the retrieval accuracy with re-ranking, and 97% without re-ranking. Compare these results to the results we got using full precision (float32) embeddings. In end-to-end RAG benchmark comparisons with full-precision embeddings, Binary Embeddings with Amazon Titan Text Embeddings V2 retain 99.1% of the full-precision answer correctness (98.6% without reranking). We encourage customers to do their own benchmarks using Amazon OpenSearch Serverless and Binary Embeddings for Amazon Titan Text Embeddings V2.

OpenSearch Serverless benchmarks using the Hierarchical Navigable Small Worlds (HNSW) algorithm with binary vectors have unveiled a 50% reduction in search OpenSearch Computing Units (OCUs), translating to cost savings for users. The use of binary indexes has resulted in significantly faster retrieval times. Traditional search methods often rely on computationally intensive calculations such as L2 and cosine distances, which can be resource-intensive. In contrast, binary indexes in Amazon OpenSearch Serverless operate on Hamming distances, a more efficient approach that accelerates search queries.
Build cost-effective RAG applications with Binary Embeddings in Amazon Titan Text Embeddings V2, Amazon OpenSearch Serverless, and Amazon Bedrock Knowledge Bases | AWS Machine Learning Blog
Note that binary embeddings are compared using the Hamming distance, which compares two arrays element by element and counts the positions where the elements differ.
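As a small illustration of the Hamming distance described above, the following sketch computes it both on plain bit lists and on bytes that pack 8 bits each (using XOR and a bit count). The function names are made up for this example.

```python
def hamming_distance_bits(a, b):
    """Count the positions where two equal-length bit lists differ."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x != y for x, y in zip(a, b))


def hamming_distance_bytes(a, b):
    """Same distance for byte-packed vectors: XOR, then count the set bits."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


first = [0, 1, 1, 0, 0, 0, 1, 1]
second = [0, 1, 0, 0, 1, 0, 1, 1]

print(hamming_distance_bits(first, second))                       # 2
print(hamming_distance_bytes(bytes([0b01100011]), bytes([0b01001011])))  # 2
```

No multiplication or square root is involved, which is why this comparison is cheaper than L2 or cosine distance on float vectors.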
Details
With this update, Amazon Bedrock Knowledge Bases supports binary embeddings.
Amazon Titan Text Embeddings V2 itself added binary embedding support a little earlier, and Knowledge Bases appears to have followed by supporting it when it generates embeddings.
https://aws.amazon.com/jp/about-aws/whats-new/2024/11/binary-embeddings-titan-text-embeddings-model-amazon-bedrock/
At the time of writing, the following models support binary embeddings in Amazon Bedrock Knowledge Bases:
Amazon Titan Text Embeddings V2
Cohere Embed (English)
Cohere Embed (Multilingual)
https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-supported.html
Also, at the time of writing, OpenSearch Serverless is the only vector database supported when using binary embeddings.
If you want to store binary vector embeddings instead of the standard floating-point (float32) vector embeddings, then you must use a vector store that supports binary vectors. Amazon OpenSearch Serverless is currently the only vector store that supports storing binary vectors.
https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html
Trying it out
Let's create a knowledge base that uses binary embeddings.
The data source and the other supporting resources were created with the following Terraform code.
https://github.com/takakuni-classmethod/genai-blog/tree/main/knowledge_bases_binary_embeddings
OpenSearch
I created the OpenSearch index settings with reference to the following pages.
https://opensearch.org/docs/latest/field-types/supported-field-types/knn-vector/#example-ivf-1
https://opensearch.org/docs/latest/field-types/supported-field-types/knn-vector#binary-vectors
The main changes are data_type and space_type. The OpenSearch documentation uses binary for data_type, but on OpenSearch Serverless I had to specify it in uppercase (BINARY).
aoss.tf
resource "opensearch_index" "this" {
  name          = "${local.prefix}-vector-index"
  index_knn     = true
  force_destroy = true
  mappings = jsonencode({
    dynamic_templates = [
      {
        strings = {
          match_mapping_type = "string"
          mapping = {
            fields = {
              keyword = {
                type         = "keyword"
                ignore_above = 2147483647
              }
            }
            type = "text"
          }
        }
      }
    ]
    properties = {
      "${local.prefix}-vector" = {
        type      = "knn_vector",
        data_type = "BINARY", # changed: store binary vectors (uppercase on OpenSearch Serverless)
        dimension = var.knowledge_bases.embeddings_model_dimensions,
        method = {
          space_type = "hamming", # changed: binary vectors are compared by Hamming distance
          engine     = "faiss",
          name       = "hnsw"
          parameters = {}
        }
      },
      AMAZON_BEDROCK_METADATA = {
        type  = "text",
        index = false
      },
      AMAZON_BEDROCK_TEXT_CHUNK = {
        type = "text"
      },
      AMAZON_BEDROCK_TEXT = {
        type = "text"
        fields = {
          keyword = {
            type = "keyword"
          }
        }
      },
      id = {
        fields = {
          keyword = {
            type = "keyword"
          }
        }
        type = "text"
      },
      x-amz-bedrock-kb-data-source-id = {
        fields = {
          keyword = {
            type = "keyword"
          }
        }
        type = "text"
      },
      x-amz-bedrock-kb-source-uri = {
        fields = {
          keyword = {
            type = "keyword"
          }
        }
        type = "text"
      }
    }
  })
  depends_on = [
    aws_opensearchserverless_security_policy.this_network,
    aws_opensearchserverless_security_policy.this_encryption,
    aws_opensearchserverless_access_policy.this_data
  ]
}
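For readers who do not use Terraform, the vector field part of the mapping above can be sketched as a plain Python dictionary. The helper name `binary_knn_vector_mapping` is hypothetical, and the multiple-of-8 check reflects my understanding of the OpenSearch binary vector documentation (the dimension counts bits, which are packed 8 per byte), so treat it as an assumption to verify.

```python
import json


def binary_knn_vector_mapping(dimension):
    """Build a knn_vector field mapping for binary vectors (illustrative helper)."""
    # Assumption: binary vector dimensions must be a positive multiple of 8,
    # because the bits are stored packed 8 per byte.
    if dimension <= 0 or dimension % 8 != 0:
        raise ValueError("dimension must be a positive multiple of 8")
    return {
        "type": "knn_vector",
        "data_type": "BINARY",  # uppercase, as required on OpenSearch Serverless in this post
        "dimension": dimension,
        "method": {
            "name": "hnsw",
            "engine": "faiss",
            "space_type": "hamming",
            "parameters": {},
        },
    }


# Field name and dimension match the values used in this post.
mapping = {"properties": {"bin-embed-vector": binary_knn_vector_mapping(1024)}}
print(json.dumps(mapping, indent=2))
```

The dimension here has to match the dimension chosen for the embeddings model in the knowledge base.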
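Knowledge bases
Now let's create the knowledge base. For the IAM role, select the one created by Terraform.
Next comes the data source configuration. For S3, also select the resource created by Terraform. The knowledge base and the S3 bucket configured as its data source must be in the same Region.
This is the highlight of today's post. For Embeddings type, select Binary vector embeddings.
I chose 1024 for the number of dimensions. If you want a different number of dimensions, note that you also need to change the corresponding part of the OpenSearch index created by Terraform.
Next, specify the vector database. Choose the option to select a vector store you have already created.
Vector engine for Amazon OpenSearch Serverless is the only option that can be selected.
Enter the following vector database settings. (These are the fields created in advance on the OpenSearch Serverless side.)
Vector index name: bin-embed-vector-index
Vector field name: bin-embed-vector
Text field name: AMAZON_BEDROCK_TEXT
Bedrock-managed metadata field name: AMAZON_BEDROCK_METADATA
If everything looks fine, create the knowledge base.
The knowledge base was created successfully with Embeddings type set to Binary vector embeddings. Select the data source and run a sync.
Below is an excerpt from the model invocation log.
output.outputBodyJson.embeddingsByType is expressed as binary.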
ModelInvocationLog.json
{
	"schemaType": "ModelInvocationLog",
	"schemaVersion": "1.0",
	"timestamp": "2024-11-25T00:16:53Z",
	"accountId": "123456789012",
	"identity": {
		"arn": "arn:aws:sts::123456789012:assumed-role/bin-embed-kb-role/EmbeddingTask-YAR9Y3VNKC"
	},
	"region": "us-west-2",
	"requestId": "92857171-76b0-4fb3-9eff-a00547d4e30b",
	"operation": "InvokeModel",
	"modelId": "arn:aws:bedrock:us-west-2::foundation-model/amazon.titan-embed-text-v2:0",
	"input": {
		"inputContentType": "application/json",
		"inputBodyJson": {
			"inputText": "「おやおや、何という元気のいい子だろう。」  おじいさんとおばあさんは、こう言って顔を見合わせながら、「あッは、あッは。」とおもしろそうに笑いました。  そして桃の中から生まれた子だというので、この子に桃太郎という名をつけました。  引用：[楠山正雄 桃太郎](https://www.aozora.gr.jp/cards/000329/files/18376_12100.html)",
			"dimensions": 1024,
			"embeddingTypes": ["binary"]
		},
		"inputTokenCount": 148
	},
	"output": {
		"outputContentType": "application/json",
		"outputBodyJson": {
			"embeddingsByType": {
				"binary": [0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0]
			},
			"inputTextTokenCount": 148
		}
	}
}
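The request shown in this log can be reproduced outside the knowledge base with a direct model call. The following is a minimal sketch, assuming boto3 is installed, credentials are configured, and the model is enabled in us-west-2; the helper names are my own, and the request and response fields are taken from the log above.

```python
import json

MODEL_ID = "amazon.titan-embed-text-v2:0"


def build_request_body(text, dimensions=1024):
    """Build the same request body that appears in the invocation log."""
    return json.dumps(
        {"inputText": text, "dimensions": dimensions, "embeddingTypes": ["binary"]}
    )


def parse_binary_embedding(response_body):
    """Extract the list of 0/1 values from the model response body."""
    payload = json.loads(response_body)
    return payload["embeddingsByType"]["binary"]


def embed_binary(text, region="us-west-2"):
    """Call the model; requires AWS credentials and model access."""
    import boto3  # imported lazily so the helpers above work without boto3

    client = boto3.client("bedrock-runtime", region_name=region)
    response = client.invoke_model(
        modelId=MODEL_ID,
        body=build_request_body(text),
        contentType="application/json",
        accept="application/json",
    )
    return parse_binary_embedding(response["body"].read())
```

Calling `embed_binary("some text")` should return a list of 0s and 1s whose length equals the requested dimension count.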
Let's also look at what is stored in OpenSearch.
GET bin-embed-vector-index/_search
The embedding data is stored in bin-embed-vector.
{
  "took": 42,
  "timed_out": false,
  "_shards": {
    "total": 0,
    "successful": 0,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 25,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "bin-embed-vector-index",
        "_id": "1%3A0%3A6GitYJMBfoaIBvsnsP0C",
        "_score": 1,
        "_source": {
          "x-amz-bedrock-kb-source-uri": "s3://bin-embed-us-west-2-kb-datasource-123456789012/桃太郎第1章.txt",
          "AMAZON_BEDROCK_TEXT": "おばあさんはそこで、「あっちの水は、かあらいぞ。こっちの水は、ああまいぞ。かあらい水は、よけて来い。ああまい水に、よって来い。と歌いながら、手をたたきました。  すると桃はまた、「ドンブラコッコ、スッコッコ。ドンブラコッコ、スッコッコ。」といいながら、おばあさんの前へ流れて来ました。  おばあさんはにこにこしながら、「早くおじいさんと二人で分けて食べましょう。」と言って、桃をひろい上げて、洗濯物といっしょにたらいの中に入れて、えっちら、おっちら、かかえておうちへ帰りました。  夕方になってやっと、おじいさんは山からしばを背負って帰って来ました。  「おばあさん、今帰ったよ。」",
          "AMAZON_BEDROCK_METADATA": """{"source":"s3://bin-embed-us-west-2-kb-datasource-123456789012/桃太郎第1章.txt"}""",
          "x-amz-bedrock-kb-data-source-id": "EROKQZRX7T",
          "bin-embed-vector": [
            82,
            -115,
            34,
            46,
            60,
            -97,
            -91,
            43,
            -111,
            -52,
            122,
            54,
            38,
            -34,
            107,
            -110,
            28,
            -58,
            -71,
            78,
            119,
            -26,
            -15,
            -62,
            -16,
            58,
            -42,
            -86,
            116,
            67,
            -116,
            -95,
            -30,
            93,
            -25,
            103,
            66,
            -35,
            -85,
            -43,
            -8,
            95,
            70,
            -67,
            -16,
            -32,
            119,
            93,
            -57,
            100,
            90,
            12,
            -95,
            -55,
            -102,
            -120,
            69,
            110,
            -15,
            8,
            27,
            -62,
            -51,
            -28,
            -19,
            -116,
            48,
            -38,
            -113,
            24,
            70,
            6,
            107,
            -78,
            61,
            84,
            -17,
            -4,
            126,
            20,
            34,
            -71,
            21,
            105,
            27,
            -61,
            -85,
            -20,
            123,
            -124,
            -44,
            -3,
            -84,
            36,
            -55,
            -125,
            7,
            66,
            -74,
            -60,
            -98,
            43,
            -32,
            92,
            -60,
            46,
            94,
            106,
            -51,
            -68,
            -10,
            7,
            -26,
            17,
            107,
            -56,
            -57,
            -59,
            -44,
            39,
            18,
            -86,
            -127,
            -120,
            -66,
            14,
            100,
            -60
          ],
          "id": "74c8a859-b258-4721-a211-be751816469a"
        }
      },
      {
        "_index": "bin-embed-vector-index",
        "_id": "1%3A0%3A6mitYJMBfoaIBvsnsP0C",
        "_score": 1,
        "_source": {
          "x-amz-bedrock-kb-source-uri": "s3://bin-embed-us-west-2-kb-datasource-123456789012/桃太郎第1章.txt",
          "AMAZON_BEDROCK_TEXT": "「いいえ、買って来たのではありません。今日川で拾って来たのですよ。」  「え、なに、川で拾って来た。それはいよいよめずらしい。」  こうおじいさんは言いながら、桃を両手にのせて、ためつ、すがめつ、ながめていますと、だしぬけに、桃はぽんと中から二つに割れて、  「おぎゃあ、おぎゃあ。」  と勇ましいうぶ声を上げながら、かわいらしい赤さんが元気よくとび出しました。  「おやおや、まあ。」  おじいさんも、おばあさんも、びっくりして、二人いっしょに声を立てました。",
          "AMAZON_BEDROCK_METADATA": """{"source":"s3://bin-embed-us-west-2-kb-datasource-123456789012/桃太郎第1章.txt"}""",
          "x-amz-bedrock-kb-data-source-id": "EROKQZRX7T",
          "bin-embed-vector": [
            112,
            -115,
            43,
            63,
            -76,
            -33,
            -107,
            111,
            -109,
            77,
            106,
            22,
            6,
            -1,
            15,
            26,
            40,
            -116,
            125,
            -54,
            -17,
            5,
            -75,
            14,
            -19,
            49,
            -90,
            -124,
            84,
            83,
            -128,
            -79,
            46,
            -107,
            -14,
            71,
            2,
            -40,
            -90,
            93,
            -57,
            75,
            -58,
            57,
            -39,
            -92,
            -41,
            -89,
            -108,
            102,
            -103,
            12,
            -112,
            -87,
            -14,
            76,
            8,
            -102,
            89,
            16,
            93,
            79,
            -9,
            -34,
            -17,
            40,
            -80,
            18,
            106,
            -71,
            70,
            66,
            82,
            -12,
            -67,
            -40,
            -17,
            -42,
            -82,
            60,
            112,
            -67,
            29,
            43,
            63,
            -21,
            -113,
            -24,
            104,
            -98,
            -108,
            -3,
            -79,
            33,
            -31,
            -74,
            -81,
            107,
            -100,
            86,
            -99,
            105,
            -14,
            -16,
            29,
            6,
            114,
            96,
            123,
            -4,
            23,
            6,
            -110,
            119,
            82,
            -22,
            93,
            -108,
            -10,
            77,
            22,
            -98,
            5,
            16,
            -6,
            50,
            50,
            124
          ],
          "id": "8a00c71b-ce3e-45bc-9d1a-4304f87b4959"
        }
      }
    ]
  }
}
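The embeddings model returned binary values (0 or 1), but when they are stored in the vector database they appear to be converted to int8 values in the range [-128, 127], as required by OpenSearch. Each group of 8 bits becomes one value, so the 1024-bit vector is stored as the 128 values shown above.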
You must convert your binary data into 8-bit signed integers (int8) in the [-128, 127] range. For example, the binary sequence of 8 bits 0, 1, 1, 0, 0, 0, 1, 1 must be converted into its equivalent byte value of 99 to be used as a binary vector input.
https://opensearch.org/docs/latest/field-types/supported-field-types/knn-vector#binary-vectors
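The conversion quoted above can be sketched in a few lines. This is an illustrative helper of my own, not code taken from OpenSearch or Bedrock; it packs bits most significant bit first, which is the order implied by the documentation's example (0, 1, 1, 0, 0, 0, 1, 1 becomes 99).

```python
def pack_bits_to_int8(bits):
    """Pack a list of 0/1 values into signed 8-bit integers, 8 bits per value."""
    if len(bits) % 8 != 0:
        raise ValueError("number of bits must be a multiple of 8")
    packed = []
    for start in range(0, len(bits), 8):
        byte = 0
        for bit in bits[start:start + 8]:
            byte = (byte << 1) | bit  # most significant bit first
        # Reinterpret 128..255 as negative values (two's complement).
        packed.append(byte - 256 if byte > 127 else byte)
    return packed


print(pack_bits_to_int8([0, 1, 1, 0, 0, 0, 1, 1]))  # [99]
print(pack_bits_to_int8([1, 0, 0, 0, 1, 1, 0, 1]))  # [-115]
```

Any group of 8 bits whose first bit is 1 maps to a negative number, which is why negative values appear in the stored vectors.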
Search works without any problems.
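A search against the knowledge base can also be run from code with the Retrieve API. The sketch below assumes boto3 is installed and credentials are configured; the knowledge base ID is a placeholder you would replace with your own, and the helper names are hypothetical.

```python
def summarize_results(response, preview_chars=40):
    """Turn a Retrieve API response into (score, text preview) pairs."""
    return [
        (result.get("score"), result["content"]["text"][:preview_chars])
        for result in response["retrievalResults"]
    ]


def retrieve_chunks(knowledge_base_id, query, top_k=3, region="us-west-2"):
    """Query the knowledge base; requires AWS credentials."""
    import boto3  # imported lazily so summarize_results works without boto3

    client = boto3.client("bedrock-agent-runtime", region_name=region)
    response = client.retrieve(
        knowledgeBaseId=knowledge_base_id,
        retrievalQuery={"text": query},
        retrievalConfiguration={
            "vectorSearchConfiguration": {"numberOfResults": top_k}
        },
    )
    return summarize_results(response)
```

For example, `retrieve_chunks("YOUR_KB_ID", "桃太郎はどこから生まれましたか")` would return the top chunks with their scores, where `YOUR_KB_ID` stands in for a real knowledge base ID.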
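Summary
That's all for "Amazon Bedrock Knowledge Bases now supports binary embeddings."
It looks useful for cases such as loading a large volume of documents for RAG.
I think a sensible approach is to try it as one option once tuning such as reranking is largely done and costs start to stand out.
I hope this post is helpful to someone. This was Takakuni (@takakuni_) from the Consulting Department of the AWS Business Division!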